Computer and Modernization (计算机与现代化) ›› 2011, Vol. 1 ›› Issue (11): 59-5. doi: 10.3969/j.issn.1006-2475.2011.11.016

• Chinese Information Technology •

使用进化神经网络进行文本自动分类

耿俊成1,牛霜霞1,张才俊2   

  1. 河南电力试验研究院智能电网研究所,河南郑州450052; 2. 安阳供电公司科技信息部,河南安阳455000

Text Automatic Categorization with Evolutionary Neural Network

GENG Jun-cheng1, NIU Shuang-xia1, ZHANG Cai-jun2   

  1. Smart Grid Institute, Henan Electric Power Research Institute, Zhengzhou 450052, China; 2. Department of Information Technology, Anyang Power Supply Company, Anyang 455000, China
  • Received: 2011-05-10 Online: 2011-11-28 Published: 2011-11-28

Abstract:

Artificial neural networks are an effective technique for text categorization, but the uncertainty of the network itself makes it difficult to find a suitable one. This paper uses the particle swarm optimization (PSO) algorithm to optimize the neural network, so that the network adaptively adjusts its connection weights and structure during evolution. First, the text collection is represented as a vector space. Then the information gain algorithm is used to select feature terms, and term frequency-inverse document frequency (TF-IDF) is used to compute the weight of each feature term. Finally, the evolutionary neural network is used to categorize Chinese texts automatically. Experimental results show that the evolutionary BP neural network classifies better than the original BP neural network.
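
The feature selection and weighting steps named in the abstract can be illustrated with a minimal Python sketch. It assumes the common presence/absence form of information gain and the raw `tf * log(N / df)` variant of TF-IDF; the abstract does not say which variants the authors used. The toy corpus and the function names (`information_gain`, `select_features`, `tfidf_vectors`) are hypothetical.

```python
import math
from collections import Counter

# Toy, already-tokenized corpus (hypothetical data, not from the paper).
docs = [
    (["stock", "market", "rises"], "finance"),
    (["market", "bank", "rates"], "finance"),
    (["team", "wins", "match"], "sports"),
    (["team", "match", "market"], "sports"),
]

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

def information_gain(term, docs):
    """IG(t) = H(C) - [P(t) H(C | t) + P(not t) H(C | not t)]."""
    with_t = Counter(label for toks, label in docs if term in toks)
    without_t = Counter(label for toks, label in docs if term not in toks)
    all_c = Counter(label for _, label in docs)
    n = len(docs)
    cond = 0.0
    for part in (with_t, without_t):
        size = sum(part.values())
        if size:  # skip an empty split to avoid dividing by zero
            cond += size / n * entropy(list(part.values()))
    return entropy(list(all_c.values())) - cond

def select_features(docs, k):
    """Keep the k terms with the highest information gain (ties: alphabetical)."""
    vocab = sorted({t for toks, _ in docs for t in toks})
    return sorted(vocab, key=lambda t: (-information_gain(t, docs), t))[:k]

def tfidf_vectors(docs, features):
    """Represent each document as a vector of tf * log(N / df) weights."""
    n = len(docs)
    df = {t: sum(1 for toks, _ in docs if t in toks) for t in features}
    vectors = []
    for toks, _ in docs:
        tf = Counter(toks)
        vectors.append([tf[t] * math.log(n / df[t]) for t in features])
    return vectors

features = select_features(docs, k=3)
vectors = tfidf_vectors(docs, features)
```

In this toy corpus the terms `team` and `match` occur only in the sports documents, so they separate the two classes perfectly and rank first. The resulting vectors are the kind of input a classifier would receive.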

Key words: text categorization, information gain, term frequency-inverse document frequency, neural network, particle swarm optimization algorithm
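
As a rough illustration of the optimization idea in the abstract, the sketch below uses a basic global-best particle swarm to search the connection weights of a small one-hidden-layer network. It is a simplification under stated assumptions, not the authors' method: the structure is fixed (the paper also evolves the structure), the inertia and acceleration constants are conventional defaults, and the data and names (`forward`, `mse`, `pso_train`) are hypothetical.

```python
import math
import random

def forward(w, x, n_in, n_hid):
    """One-hidden-layer sigmoid network; w is a flat list of weights and biases."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    idx = 0
    hidden = []
    for _ in range(n_hid):
        z = w[idx + n_in]  # bias of this hidden unit
        z += sum(w[idx + i] * x[i] for i in range(n_in))
        hidden.append(sig(z))
        idx += n_in + 1
    z = w[idx + n_hid] + sum(w[idx + j] * hidden[j] for j in range(n_hid))
    return sig(z)

def mse(w, data, n_in, n_hid):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((forward(w, x, n_in, n_hid) - y) ** 2 for x, y in data) / len(data)

def pso_train(data, n_in, n_hid, n_particles=20, iters=200, seed=0,
              inertia=0.7, c1=1.5, c2=1.5, bound=10.0):
    """Global-best PSO over the flat weight vector (assumed default constants)."""
    rng = random.Random(seed)
    dim = n_hid * (n_in + 1) + n_hid + 1
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [mse(p, data, n_in, n_hid) for p in pos]
    g = min(range(n_particles), key=pbest_f.__getitem__)
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions so the sigmoid argument stays bounded.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            f = mse(pos[i], data, n_in, n_hid)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
        history.append(gbest_f)
    return gbest, history

# Hypothetical two-feature document vectors with binary class labels.
data = [([0.9, 0.1], 1.0), ([0.8, 0.0], 1.0), ([0.1, 0.7], 0.0), ([0.0, 0.9], 0.0)]
weights, history = pso_train(data, n_in=2, n_hid=3)
```

The `history` list records the best error found so far after each iteration, so it never increases. Evolving the structure as well would require also encoding, for example, the number of hidden units in each particle, which this sketch leaves out.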